ON THE LIMIT DISTRIBUTIONS OF SOME SUMS OF A RANDOM MULTIPLICATIVE FUNCTION



ADAM J HARPER 



Abstract. We study sums of a random multiplicative function; this is an example, 
of number-theoretic interest, of sums of products of independent random variables 
(chaoses). Using martingale methods, we establish a normal approximation for the
sum over those $n \leq x$ with $k$ distinct prime factors, provided that $k = o(\log\log x)$
as $x \to \infty$. We estimate the fourth moments of these sums, and use a conditioning
argument to show that if $k$ is of the order of magnitude of $\log\log x$ then the analogous
normal limit theorem does not hold. The methods extend to treat the sum over those
$n \leq x$ with at most $k$ distinct prime factors, and in particular the sum over all $n \leq x$.
We also treat a substantially generalised notion of random multiplicative function. 

1. Introduction 

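Let $(\epsilon_p)$ be a sequence of independent Rademacher random variables, indexed by primes
$p$; that is, let

\[ \mathbb{P}(\epsilon_p = 1) = \mathbb{P}(\epsilon_p = -1) = 1/2, \]

independently for each $p$. We construct a random multiplicative function $f$, in the sense
of Halász [8] and Wintner [19], by defining

\[ f(n) := \begin{cases} \prod_{p \mid n} \epsilon_p & \text{if } n \text{ is squarefree} \\ 0 & \text{otherwise.} \end{cases} \]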



In 1944, Wintner [19] studied the behaviour of the summatory function $M(x) := \sum_{n \leq x} f(n)$,
a heuristic for the behaviour of Mertens' function^1. He showed, amongst
other things, that for each $\epsilon > 0$ one almost surely has

\[ M(x) = O(x^{1/2+\epsilon}) \quad \text{as } x \to \infty. \]
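To make the construction concrete, here is a minimal simulation sketch. It is not taken from the paper: the function names are illustrative assumptions, and the signs $\epsilon_p$ are drawn with Python's `random` module. It builds $f(n)$ for $n \leq x$ by a sieve and then forms the partial sums $M(1), \ldots, M(x)$.

```python
import random


def primes_up_to(x):
    """Return the list of primes p <= x (sieve of Eratosthenes)."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(x ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, x + 1, i):
                sieve[j] = False
    return [i for i in range(2, x + 1) if sieve[i]]


def random_multiplicative_function(x, rng):
    """Return a list f with f[n] = prod_{p | n} eps_p if n is squarefree, else 0."""
    f = [1] * (x + 1)
    f[0] = 0  # index 0 is unused
    for p in primes_up_to(x):
        eps = rng.choice((-1, 1))  # Rademacher sign attached to the prime p
        for n in range(p, x + 1, p):
            f[n] *= eps
        for n in range(p * p, x + 1, p * p):
            f[n] = 0  # multiples of p^2 are not squarefree
    return f


def partial_sums(f):
    """Return the list M with M[y] = f[1] + ... + f[y]."""
    total, out = 0, [0]
    for n in range(1, len(f)):
        total += f[n]
        out.append(total)
    return out
```

For example, `partial_sums(random_multiplicative_function(10**5, random.Random(1)))[-1]` gives one sample of $M(10^5)$, which can be compared with $\sqrt{x}$.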

This bound was improved by later authors (see the discussion of Erdős in his unsolved
problem papers [4, 5]), and the best currently known appears to be the 1982 result of



Date: 30th November 2010. 

2000 Mathematics Subject Classification. Primary 11N64; Secondary 60F05, 60G42. 

The author is supported by a studentship from the Engineering and Physical Sciences Research Council 

of the United Kingdom. 

^1 The Mertens function is the summatory function of the Möbius function $\mu(n)$, the multiplicative
function taking value $-1$ on all primes, and supported on squarefree numbers only. As discussed, for
example, in Chapter 14 of Titchmarsh [18], showing that $\sum_{n \leq x} \mu(n) = O(x^{1/2+\epsilon})$ for each $\epsilon > 0$ is
equivalent to proving the Riemann Hypothesis.
Halász [8]: there is an absolute and effective constant $A > 0$ such that, almost surely,

\[ M(x) = O\left(\sqrt{x}\, e^{A\sqrt{\log\log x \log\log\log x}}\right) \quad \text{as } x \to \infty. \]

However, by this point the motivation for the problem had shifted somewhat, with
Halász [8] writing that "A deeper aspect... is to find out how the number-theoretic
dependence among $f(n)$ effects the magnitude, compared especially with the case of
$f(n) = \pm 1$ being independent for all $n$". This is also the attitude that we adopt.

Whereas Halász [8] was making a comparison with the law of the iterated logarithm,
we will be interested in distributional properties of sums of $f(n)$, comparing with the
central limit theorem. Let $\omega(n)$ denote the number of distinct prime factors of $n$ (so
e.g. $\omega(12) = 2$), and for each $k \in \mathbb{N}$ define

\[ M^{(k)}(x) := \sum_{n \leq x, \, \omega(n) = k} f(n). \]

For these sums we have the following positive result. 

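Theorem 1. Suppose that $k \geq 1$ is $o(\log\log x)$ as $x \to \infty$. Then for each $z \in \mathbb{R}$,

\[ \mathbb{P}\left( \frac{M^{(k)}(x)}{\sqrt{\mathbb{E}M^{(k)}(x)^2}} \leq z \right) \to \Phi(z) \]
as $x \to \infty$, where $\Phi(z) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$ is the standard normal distribution function.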

Theorem 1 is a refinement of a result of Hough [12], [13], who showed by the method
of moments that taking $k \geq 1$, $k = o(\log\log\log x)$ is permissible for a normal approximation.
The theorem will be proved in §4, and we postpone until then any motivation
for studying subsums of this type (but see below).

Readers may identify $M^{(k)}(x)$ as an example of so-called Rademacher chaos of order
$k$, or as a generalised type of U-statistic^2. Our proof of Theorem 1 uses a martingale
version of the central limit theorem, due to McLeish [14], which is an idea obtained
by the author after reading the paper of Blei and Janson [1]. They apply martingale
methods to study a general Rademacher chaos, but, in common with other articles on
Rademacher chaos, do this when the order $k$ is fixed. In our special case, we apply
information about numbers with a given quantity of prime factors, and find we can let
$k(x)$ tend to infinity along with $x$.

In the framework of martingale theory, the computations that allow us to deduce
Theorem 1 imply other results about $M^{(k)}(x)$. For example, combining the work of §4
with a central limit theorem of Haeusler and Joos [7], we obtain (roughly) a rate of



^2 A U-statistic of order $k$ has the form $\sum h(X_{i(1)}, \ldots, X_{i(k)})$, where $(X_i)$ is an i.i.d. sequence of
random variables, $h$ is a real-valued function, and the sum is over all tuples $(i(1), \ldots, i(k))$ of distinct
numbers smaller than $n$. See the book of de la Peña and Giné [3] for much more discussion of U-statistics
and chaoses.
convergence O ((1 + z^)~^(k/log log x)^/^) in Theorem 1. We refer the reader to Hall
and Heyde's book [9] for further discussion of the sorts of result that are possible.

It is a classical result of Hardy and Ramanujan [10] that "the normal order of the
number of different prime factors of a number is log log n", and in particular

#{n ≤ x : |ω(n) − log log x| > (log log x)'^/''} = o(x) as x → ∞.

This means that the range of $k$ allowed in Theorem 1 stops just short of telling us about
sums of $f(n)$ over 'typical' numbers.
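The concentration of $\omega(n)$ around $\log\log x$ can be explored numerically. The sketch below is an illustration only: the function names are assumptions, and the `exponent` argument is a free parameter chosen by the caller rather than a value taken from the text. It tabulates $\omega(n)$ by a sieve and reports the proportion of $n \leq x$ for which $|\omega(n) - \log\log x|$ exceeds $(\log\log x)^{\text{exponent}}$.

```python
import math


def omega_table(x):
    """Return a list omega with omega[n] = number of distinct prime factors of n."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for n in range(p, x + 1, p):
                omega[n] += 1
    return omega


def fraction_far_from_normal_order(x, exponent):
    """Proportion of n <= x with |omega(n) - log log x| > (log log x)^exponent."""
    omega = omega_table(x)
    ll = math.log(math.log(x))
    far = sum(1 for n in range(1, x + 1) if abs(omega[n] - ll) > ll ** exponent)
    return far / x
```

Since $\log\log x$ grows extremely slowly, the proportion decays slowly as well, so small experiments should be read as qualitative only.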

As we will discuss in §5, there is a clear change of behaviour of $M^{(k)}(x)$ when $k$ is
comparable in size with $\log\log x$, compared to when $k = o(\log\log x)$. One aspect is
that, if $p$ is some 'not very large' prime, one has

\[ \#\{n \leq x : \omega(n) = k(x), \, p \mid n\} = o\left( \frac{1}{p} \#\{n \leq x : \omega(n) = k(x)\} \right) \]

if $k(x) = o(\log\log x)$, whereas numbers with about $\log\log x$ prime factors are divisible
by such $p$ in roughly the usual proportions $1/p$. Thus, in the latter case, there are
many $\epsilon_p$ with a large influence on the behaviour of $M^{(k)}(x)$. This is evidenced by the
following moment estimate: the reader should observe that the quantity $L$ appearing is
of the order of magnitude of $\log\log x$, so the quantity $\kappa$ is of the order of magnitude of
$k/\log\log x$, whenever k < log^'^x, say.

Proposition 1. Let $k \geq 2$, $x \geq 3$, and write

\[ m_4^{(k)}(x) := \frac{\mathbb{E}M^{(k)}(x)^4}{(\mathbb{E}M^{(k)}(x)^2)^2} - 3. \]

Also write $L = L(k,x) := \log\log x - \log k - \log\log(k+1)$, and $\kappa := k/L$. There are
constants $C, \delta > 0$ such that, provided $x \geq C$:

(i) ifk < (loglogx)^-^ thenmf\x) > (k / \og{k + 2)f ; 

(ii) if k < (5 log x/ (log log x)^, then rn"^\x) ^ (/c/log/c)^. 

The first bound is stronger than the second roughly when k < (loglogx)^'^"*-^-', and 
otherwise the second bound extends the first. If $\epsilon > 0$, and $k \leq (1/2 - \epsilon)\log\log x$,
one can use the methods of §4 to obtain upper bounds for $m_4^{(k)}(x)$ that agree, up to
an $\epsilon$-dependent constant, with these lower bounds. For larger $k$ it seems unlikely that
Proposition 1 is sharp.
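For very small $x$ the moments of $M^{(k)}(x)$ can be computed exactly by enumerating every sign pattern. The sketch below is illustrative and rests on one assumption: that the quantity of interest is the normalised fourth moment $\mathbb{E}M^{(k)}(x)^4 / (\mathbb{E}M^{(k)}(x)^2)^2$ minus 3. The function names are hypothetical, and the cost is exponential in the number of primes involved, so it says nothing about the asymptotic regime of Proposition 1.

```python
from fractions import Fraction
from itertools import product


def prime_factors(n):
    """Prime factorisation of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


def exact_even_moments(x, k):
    """Return (E M^2, E M^4) for M = M^(k)(x), averaging over all sign patterns."""
    support = []
    for n in range(2, x + 1):
        fs = prime_factors(n)
        if len(fs) == k and len(set(fs)) == k:  # squarefree with exactly k primes
            support.append(fs)
    primes = sorted({p for fs in support for p in fs})
    second = fourth = 0
    for signs in product((-1, 1), repeat=len(primes)):
        eps = dict(zip(primes, signs))
        total = 0
        for fs in support:
            term = 1
            for p in fs:
                term *= eps[p]
            total += term
        second += total ** 2
        fourth += total ** 4
    patterns = 2 ** len(primes)
    return Fraction(second, patterns), Fraction(fourth, patterns)


def excess_fourth_moment(x, k):
    """Normalised fourth moment minus 3 (requires a non-empty support)."""
    second, fourth = exact_even_moments(x, k)
    return fourth / second ** 2 - 3
```

When $k = 1$ the sum is a plain sum of independent signs, so the excess is negative; any inflation above 3 can only appear for $k \geq 2$.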

The inflation of the fourth moment of $M^{(k)}(x)$, compared with $\mathbb{E}N(0,1)^4 = 3$, suggests
that the range of $k$ in Theorem 1 may be best possible. However, it is easy to
construct examples of random variables for which this intuition fails. In §6, we will
prove that this is not the case for us.
Theorem 2. Let $\epsilon, R$ be fixed such that $0 < \epsilon < R$, and suppose that for all large $x$,

\[ \epsilon \log\log x \leq k(x) \leq R \log\log x. \]

Then Theorem 1 does not hold for $M^{(k)}(x)$ as $x \to \infty$.

The fact that the fourth moment of $M^{(k)}(x)$ does not converge to 3 would establish
Theorem 2, if some higher absolute moment was bounded as $x \to \infty$. See Chapter 8.1
of Feller's book [6]. However, the author did not explore this approach, for two reasons:
firstly, because such boundedness would very likely not hold on the whole range of $k$ in
Theorem 2 (by analogy with the situation for $M(x)$), and would certainly be difficult
to establish; and secondly, because in general it is interesting to consider how to disprove
convergence in distribution when computing even moments does not help. Our proof of Theorem 2 uses
a conditioning argument, which supplies upper bounds for some truncated moments of
$M^{(k)}(x)$. The inflation of $\mathbb{E}M^{(k)}(x)^4$ means that the distribution of $M^{(k)}(x)$ has greater
kurtosis than the standard normal distribution (that is, sharper concentration of most
of the density around the mean, and heavier tails). The fact underlying Theorem 2
is that, in this case, the tails are heavy enough to noticeably depress some truncated
averages.

The preceding arguments extend fairly easily to deal with some other types of sum.
Introduce the notation

\[ M^{(\leq k)}(x) := \sum_{n \leq x, \, \omega(n) \leq k} f(n). \]

Ben Green suggested these sums to the author as an object of study, and for them we
have the following result.

Corollary 1. If $M^{(k)}(x)$ is replaced by $M^{(\leq k)}(x)$, then:

(i) Theorem 1 holds without change;

(ii) Theorem 2 holds without the need for an upper bound on $k$, i.e. it is enough if

\[ k(x) \geq \epsilon \log\log x \]

for fixed $\epsilon > 0$.

Notice that the complete sum $M(x)$ is the same thing as $M^{(\leq k)}(x)$ for any $k \geq \log x/\log 2$, so in particular
$M(x)/\sqrt{\mathbb{E}M(x)^2}$ does not converge in distribution to $N(0,1)$. This confirms a heuristic
of Chatterjee, expressed in Hough's paper [12] (and also see the very recent preprint [2]
of Chatterjee and Soundararajan). See §6 for some more discussion of this.

A few remarks should be enough to convince the reader that Corollary 1 is true.
If $k = o(\log\log x)$, then the number of integers with at most $k - 1$ prime factors is of
smaller order than the number with exactly $k$ prime factors. This is basically enough for
the extension of Theorem 1, whilst extending Theorem 2 is accomplished by summing
bounds obtained in the original proof. We provide a few more details of these arguments
in §7. One could extend Proposition 1 in a similar way, but there seems to be little
interest in doing this, as the lower bounds obtained would be rather weak.

The reader may notice, when reading §§4-7, that the fact that the $\epsilon_p$ are Rademacher
random variables is not used in a very essential way. Our final result establishes a
precise version of this principle.

Theorem 3. Let $(\epsilon_p)$ be any sequence of independent random variables, also satisfying:

(i) (symmetry) $\epsilon_p$ has the same distribution as $-\epsilon_p$, for each $p$;

(ii) (normalisation) $\mathbb{E}(\epsilon_p^2) = 1$ for each $p$;

(iii) (bounded fourth moments) there is a constant $C > 0$ such that $\mathbb{E}(\epsilon_p^4) \leq C$ for
all $p$.

If a random multiplicative function is defined using the $\epsilon_p$, then the results of Theorem
1, Proposition 1, Theorem 2, and Corollary 1 still hold.

The fact that Proposition 1 continues to hold is immediate, because $\mathbb{E}(\epsilon_p^4) \geq (\mathbb{E}(\epsilon_p^2))^2 = 1$,
so the case of Rademacher random variables is worst possible for obtaining lower
bounds on fourth moments. Obtaining Theorems 1 and 2 in general requires more
work, and we sketch the necessary modifications in §8. We leave it to the reader to
verify that Corollary 1 holds in the more general setting.

One important case included in Theorem 3 is that of standard normal random variables
$\epsilon_p$, leading to a random multiplicative function that is an example of Gaussian
chaos, but we emphasise that there is no requirement even that the $\epsilon_p$ be identically
distributed.

We conclude this introduction with a brief list of further work suggested by our
results. If we combine Theorems 1 and 2, we still have no information about the
behaviour of $M^{(k)}(x)$ if $k(x)$ is not bounded by a constant multiple of $\log\log x$. Although
very few numbers $n \leq x$ have so many distinct prime factors, it would be nice to have
a complete result. It seems likely, in view of Proposition 1, that the negative result in
Theorem 2 could be extended onto the larger range.

Another problem is to give a positive description of the limiting behaviour of $M^{(k)}(x)$
when $\epsilon \log\log x \leq k(x) \leq R \log\log x$. It may be unreasonable to expect a simple limit
distribution, and the type of theorem sought might say that one obtains the same
limiting distribution for any $\epsilon_p$ as in Theorem 3. A reader interested in pursuing this
might consult, for example, the book of de la Peña and Giné [3], who present similar
theorems for U-statistics.
A third line of work would be to return to the problem treated by Halász [8] and
others, of obtaining almost sure bounds for $M(x)$. As remarked previously (and also
see §§2, 6), there are many random variables having a large influence on this sum, so
it does not have the 'concentration' properties under which martingale methods perform
nicely. Because of this, obtaining sharp bounds for $M(x)$ seems to remain difficult.

2. The martingale central limit theorem 

We begin by applying some observations of Blei and Janson [1], made by them in
the context of a general Rademacher chaos of fixed order. Let $P(n)$ denote the largest
prime factor of $n \geq 2$, and for prime $p$ write

\[ M_p^{(k)}(x) := \sum_{n \leq x, \, \omega(n) = k, \, P(n) = p} f(n). \]

Also let $(\mathcal{F}_p)$ denote the natural filtration of the random variables $\epsilon_p$, so that $\mathcal{F}_p$ is the
sigma algebra generated by $\{\epsilon_q : q \leq p\}$.
We have the decomposition

\[ M^{(k)}(x) = \sum_{p \leq x} M_p^{(k)}(x). \]

Since $\epsilon_p$ occurs in each summand of $M_p^{(k)}(x)$, and not in $M_q^{(k)}(x)$ for any $q < p$, the
conditional expectation

\[ \mathbb{E}(M_p^{(k)}(x) \mid \{\epsilon_q : q < p\}) \]

vanishes, so $(M_p^{(k)}(x))_{p \leq x}$ is a martingale difference sequence relative to $((\mathcal{F}_p), \mathbb{P})$.
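The decomposition by largest prime factor is easy to reproduce numerically. The following sketch uses hypothetical helper names and takes the signs as a dictionary `eps` that must contain every prime $p \leq x$. It computes the pieces $M_p^{(k)}(x)$, and the accompanying checks confirm that flipping $\epsilon_p$ negates $M_p^{(k)}(x)$ while leaving $M_q^{(k)}(x)$ unchanged for $q < p$, which is the mechanism behind the vanishing conditional expectation.

```python
import random


def largest_prime_factor_table(x):
    """Return lpf with lpf[n] = P(n), the largest prime factor of n (lpf[0] = lpf[1] = 0)."""
    lpf = [0] * (x + 1)
    for p in range(2, x + 1):
        if lpf[p] == 0:  # no smaller prime divides p, so p is prime
            for n in range(p, x + 1, p):
                lpf[n] = p  # larger primes overwrite smaller ones
    return lpf


def decomposition(x, k, eps):
    """Return {p: M_p^(k)(x)} for the signs eps (a dict prime -> +1 or -1)."""
    lpf = largest_prime_factor_table(x)
    parts = {p: 0 for p in eps}
    for n in range(2, x + 1):
        value, count, m = 1, 0, n
        while m > 1 and value != 0:
            p = lpf[m]
            m //= p
            if m % p == 0:
                value = 0  # p^2 divides n, so f(n) = 0
            else:
                value *= eps[p]
                count += 1
        if value != 0 and count == k:
            parts[lpf[n]] += value
    return parts
```

Summing the returned values over $p$ recovers $M^{(k)}(x)$ for the same signs.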

Like Blei and Janson [1], we will use the following version, essentially Corollary 2.13
of McLeish [14], of the martingale central limit theorem.

Central Limit Theorem 1 (McLeish, 1974). For $n \in \mathbb{N}$, suppose that $k_n \in \mathbb{N}$, and
that $X_{i,n}$, $1 \leq i \leq k_n$, is a martingale difference sequence on $(\Omega, \mathcal{F}, (\mathcal{F}_{i,n})_i, \mathbb{P})$. Write

$S_n := \sum_{i \leq k_n} X_{i,n}$, and suppose that the following conditions hold:

(i) (normalised variances) $\sum_{i \leq k_n} \mathbb{E}X_{i,n}^2 \to 1$ as $n \to \infty$;

(ii) (Lindeberg condition) for each $\epsilon > 0$, we have $\sum_{i \leq k_n} \mathbb{E}(X_{i,n}^2 1_{|X_{i,n}| > \epsilon}) \to 0$ as
$n \to \infty$;

(iii) (cross terms condition) $\limsup_{n \to \infty} \sum_{i \leq k_n} \sum_{j \leq k_n, j \neq i} \mathbb{E}X_{i,n}^2 X_{j,n}^2 \leq 1$.
Then $S_n$ converges in distribution to $N(0,1)$ as $n \to \infty$.



We will apply this to the normalised random variables

\[ M_p^{(k)}(x) / \sqrt{\mathbb{E}M^{(k)}(x)^2}, \]
where $x$ (which we can obviously allow to tend to infinity through integer values) takes
the place of $n$, and primes $p$ play the role of the indices $i$. In fact, since

\[ \mathbb{E}M^{(k)}(x)^2 = \sum_{m \leq x, \, \omega(m) = k} \; \sum_{n \leq x, \, \omega(n) = k} \mathbb{E}f(m)f(n) = \#\{n \leq x : n \text{ is squarefree}, \, \omega(n) = k\}, \]

and similarly for $\mathbb{E}M_p^{(k)}(x)^2$, the normalised variances condition holds trivially (i.e. not
just in the limit, but for each $x$ individually^3).
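The orthogonality used here can be confirmed exactly for small $x$. The sketch below is a brute-force check with hypothetical function names, exponential in the number of primes up to $x$. It averages $f(m)f(n)$ over every sign pattern and returns the resulting table, which should be 1 on the diagonal and 0 elsewhere.

```python
from fractions import Fraction
from itertools import product


def prime_factors(n):
    """Prime factorisation of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


def correlation_table(x):
    """Return {(m, n): E f(m) f(n)} over squarefree m, n <= x, by exact enumeration."""
    squarefree = {}
    for n in range(1, x + 1):
        fs = prime_factors(n)
        if len(set(fs)) == len(fs):
            squarefree[n] = fs
    primes = sorted({p for fs in squarefree.values() for p in fs})
    sums = {(m, n): 0 for m in squarefree for n in squarefree}
    for signs in product((-1, 1), repeat=len(primes)):
        eps = dict(zip(primes, signs))
        values = {}
        for n, fs in squarefree.items():
            v = 1
            for p in fs:
                v *= eps[p]
            values[n] = v
        for (m, n) in sums:
            sums[(m, n)] += values[m] * values[n]
    patterns = 2 ** len(primes)
    return {pair: Fraction(total, patterns) for pair, total in sums.items()}
```

Summing the table over pairs $(m, n)$ with $\omega(m) = \omega(n) = k$ then gives the variance of $M^{(k)}(x)$, which equals the number of squarefree $n \leq x$ with $k$ prime factors.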

We comment briefly on the conditions of Central Limit Theorem 1, although the
reader is referred again to McLeish's excellent paper [14] for further discussion. The
Lindeberg condition forces that individual summands $X_{i,n}$ are "asymptotically negligible";
the collective force of the conditions is, roughly, that

\[ \mathbb{E}\left( \sum_{i \leq k_n} X_{i,n}^2 - 1 \right)^2 \to 0 \quad \text{as } n \to \infty, \]

so that the sum of squares concentrates around the desired variance in the limit. The
failure of this behaviour will feature prominently in §6.



3. Some number-theoretic estimates 

Next, we record some estimates for the quantity of (squarefree) numbers with a fixed
number of distinct prime factors, which will be needed repeatedly. We write $\Omega(n)$ for
the number of prime factors of $n$ counted with multiplicity, so e.g. $\Omega(12) = 3$. Thus
$\omega(n) = \Omega(n)$ if and only if $n$ is squarefree, and this is how the notation $\Omega(\cdot)$ will be
used.

The following is a standard upper bound: 

Number Theory Result 1 (Hardy and Ramanujan, 1917). There are absolute constants
$A, B$ such that, for all $k \geq 1$ and $x \geq 2$,

\[ \#\{n \leq x : \omega(n) = k\} \leq \frac{A x (\log\log x + B)^{k-1}}{(k-1)! \log x}. \]

This is due to Hardy and Ramanujan [10], and the reader may wish to note (although
we do not need to do so here) that analogous uniform bounds do not hold when counting
prime factors with multiplicity, whilst in many other situations there is essentially no
difference between these cases.
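The shape of this bound can be compared with actual counts for modest $x$. In the sketch below the function names are illustrative, and `main_term_shape` evaluates the right hand side with $A = 1$ and $B = 0$. Those values are placeholders chosen for the experiment, since the text does not specify the constants.

```python
import math


def counts_by_omega(x):
    """Return a dict counts with counts[k] = #{n <= x : omega(n) = k}."""
    omega = [0] * (x + 1)
    for p in range(2, x + 1):
        if omega[p] == 0:  # no smaller prime divides p, so p is prime
            for n in range(p, x + 1, p):
                omega[n] += 1
    counts = {}
    for n in range(1, x + 1):
        counts[omega[n]] = counts.get(omega[n], 0) + 1
    return counts


def main_term_shape(x, k):
    """x (log log x)^(k-1) / ((k-1)! log x), i.e. the bound with A = 1, B = 0 (placeholders)."""
    ll = math.log(math.log(x))
    return x * ll ** (k - 1) / (math.factorial(k - 1) * math.log(x))
```

Printing `counts_by_omega(x)[k] / main_term_shape(x, k)` for a few values of $k$ gives a rough feel for how large the constants need to be at that $x$.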

We cite the following lower bound from the text of Montgomery and Vaughan [15] ; 
it is easily implied, for example, by Exercise 4 of their §7.4. 

^3 Here we have only used that the random variables $f(n)$ are orthogonal, i.e. $\mathbb{E}f(m)f(n) = 0$ unless
$m = n$.
Number Theory Result 2 (Sathé, Selberg, 1954). There is a constant $\delta > 0$ such
that, for $x \geq 3$ and $1 \leq k \leq \log\log x$,

\[ \#\{n \leq x : \omega(n) = \Omega(n) = k\} \geq \frac{\delta x (\log\log x)^{k-1}}{(k-1)! \log x}. \]

Montgomery and Vaughan [15] present Selberg's proof, refining one of Sathé, of
asymptotic formulae for the quantities in Number Theory Results 1 and 2. This is more
than we need, but the author knows of no earlier proof obtaining lower bounds on a
comparable range of $k$.

In §§5-8, we will need some variants of these two results. We begin by presenting an
'elementary' lower bound of Pomerance [17]:

Number Theory Result 3 (Pomerance, 1984). There is an absolute constant $C$ such
that, if $x \geq C$ and $\log\log x (\log\log\log x)^2 \leq k \leq \log x/(3\log\log x)$, we have

\[ \#\{n \leq x : \omega(n) = \Omega(n) = k\} \geq \frac{x}{k! \log x} e^{k(\log L + \log L/L + O(1/L))}, \]

where $L = L(k,x) := \log\log x - \log k - \log\log(k+1)$.

We will use this to verify that $\#\{n \leq x : \omega(n) = \Omega(n) = k\}$ is much larger than some
error terms that will be subtracted from it. Observe that the upper bound restriction
on $k$ is impressively large, differing by a bounded factor from the maximum for which
the left hand side is non-zero.

In their 1988 paper, Hildebrand and Tenenbaum [11] give various estimates for
$\#\{n \leq x : \omega(n) = k\}$ on the range $1 \leq k \leq \log x/(\log\log x)^2$, which appear to include all
previous results on that range. We will make substantial use of slight modifications of
these; but it is fairly easy to see that the methods of Hildebrand and Tenenbaum [11]
imply the results. (Most of their arguments carry over without change, once one checks
that suitable parts of their Lemmas 1 and 2 hold in the modified situations.)

Hildebrand and Tenenbaum [11] use their asymptotic for $\#\{n \leq x : \omega(n) = k\}$,
which involves two 'implicit' parameters p = p(x,k) and a = a(x,k), to "...get rather
precise information on the local behaviour of the function". In an exactly similar way,
we have:

Number Theory Result 4 (Hildebrand and Tenenbaum, 1988). There are absolute
constants $C, \delta > 0$ such that, if $x \geq C$ and $1 \leq k \leq \delta \log x/(\log\log x)^2$,

#{n < X : uj{n) = fi(n) = k + l}_L/^^ f^ogL 



< X : uj{n) = f2(n) = k} k \ \ L 

where $L$ is as in Number Theory Result 3. If 1 < A < x, also

#{n < Ax : u{n) = n{n) = k} _ ^ A ^ log A \ ^^^^^"^ ^og/L+k log l log x/(l^ log x))
#{n < X : uj{n) = il{n) = k} \ logx /
Finally, we state a rather specialised variant of Corollary 1 of Hildebrand and Tenenbaum
[11]; it is possible to give an explicit expression for the function h = h(k, x) that
appears, but we simply note that h = 0(log^(2 + fc)/(loglogx)^).

Number Theory Result 5. Let R> 0, and let V be a set of prime numbers. There 
are absolute constants C,6 > such that, if x > C and 1 < k < 5(loglogx)^, 

i^{n < X : uj{n) = Q{n) = k, p \ n\/p E V} 

= Gik) TT f 1 + ' x(loglogx)^^_,,/2 (i^oi ^ ^ +o( 
peiv PJ {k-l)\\ogx \ \\og\ogxJ \ 

uniformly for < R. Here k is as in the Introduction, and 



(log log a;) 2 



^ ' p prime ^ i. / ^ 

When thinking about the function $G(z)$, the following estimate will occasionally be
useful^4: uniformly for $z \geq 2$, we have

— log zG z = 2^ ^ + log 1-- 

dz Viz) ^-^ \p + z \ p 

p prime ^ ^ 



\[ = -\log(z \log z) - \gamma + O\left( \frac{1}{\log z} \right), \]
where $\gamma \approx 0.577$ is Euler's constant.


4. Proof of Theorem 1 

4.1. Some opening remarks. Before embarking on the proof of Theorem 1, we give
the promised motivation for studying the sums $M^{(k)}(x)$. One reason is the connection
with the large body of probabilistic literature on Rademacher chaos: it is usual to study
chaoses of fixed order $k$, cf. the papers of Blei and Janson [1] or of Nourdin, Peccati
and Reinert [16], although one does not typically let $k = k(x)$ tend to infinity. When
Hough [12] studies $M^{(k)}(x)$, he wishes to exploit that "...for $k$ slowly growing compared
to $x$, numbers having only $k$ prime factors almost always have a [very large] prime
factor...". However, there is a third justification for studying $M^{(k)}(x)$, which appeals to
the present author rather more.

Fix $x$ and $k$, and let $I$ be a random variable independent of all the $\epsilon_p$, having the
discrete uniform distribution on

\[ \{p \leq x : p \text{ is prime}\}. \]



^4 This is modified, and slightly corrected, from p 498 of Hildebrand and Tenenbaum's paper [11].
Define a new random multiplicative function $f'(n)$, by replacing $\epsilon_p$ with an independent
copy $\epsilon'_p$ when $I = p$, and leaving the other $\epsilon_q$ unchanged. If we write
$N^{(k)}(x) := \sum_{n \leq x, \, \omega(n) = k} f'(n)$, then the pair $(M^{(k)}(x), N^{(k)}(x))$ has two important properties:

(i) (exchangeability) for all sets $C, D \subseteq \mathbb{Z}$,

\[ \mathbb{P}(M^{(k)}(x) \in C \text{ and } N^{(k)}(x) \in D) = \mathbb{P}(M^{(k)}(x) \in D \text{ and } N^{(k)}(x) \in C); \]

(ii) (regression) writing (as usual) $\pi(x) := \#\{p \leq x : p \text{ is prime}\}$,

\[ \mathbb{E}(N^{(k)}(x) \mid M^{(k)}(x)) = \left( 1 - \frac{k}{\pi(x)} \right) M^{(k)}(x). \]

We leave the reader to verify these, and that an analogue of the regression property
would not hold if one attempted a similar construction for e.g. the complete sum $M(x)$.
We will not use the exchangeable pair construction here, but it is often useful to have it
available: see, for example, the application to normal approximation in §3.3 of Nourdin,
Peccati and Reinert's paper [16].
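The regression property can be checked exactly for a fixed sign pattern. The sketch below uses hypothetical function names, and `eps` must map every prime $p \leq x$ to $\pm 1$. It averages $N^{(k)}(x)$ over the uniformly chosen prime $I$ and over the two values of the resampled sign, conditionally on all the $\epsilon_p$. This is a stronger conditioning than on $M^{(k)}(x)$ alone, and the stated regression identity follows from it by taking a further conditional expectation.

```python
from fractions import Fraction
import random


def prime_factors(n):
    """Prime factorisation of n with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= n:
        while n % p == 0:
            out.append(p)
            n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out


def conditional_mean_of_N(x, k, eps):
    """Return (M, E(N | eps)) for M = M^(k)(x) and its resampled copy N = N^(k)(x)."""
    primes = sorted(eps)
    support = []
    for n in range(2, x + 1):
        fs = prime_factors(n)
        if len(fs) == k and len(set(fs)) == k:  # squarefree with exactly k primes
            support.append(fs)

    def chaos(signs):
        total = 0
        for fs in support:
            term = 1
            for p in fs:
                term *= signs[p]
            total += term
        return total

    accumulated = 0
    for i in primes:              # the value taken by the uniform prime I
        for new_sign in (-1, 1):  # the value taken by the independent copy of eps_I
            resampled = dict(eps)
            resampled[i] = new_sign
            accumulated += chaos(resampled)
    return chaos(eps), Fraction(accumulated, 2 * len(primes))
```

The second returned value should equal $(1 - k/\pi(x))$ times the first, because each $n$ counted has exactly $k$ prime factors among the $\pi(x)$ candidates for $I$.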

The proof of Theorem 1 is somewhat technical, so first we highlight what the important
steps will be. In §4.2, we will reduce the proof to showing that a certain sum,
counting quadruples of integers according to their 'squarefree parts', is of small order
when $k(x) = o(\log\log x)$.

We will split this sum, fixing the number of prime factors of the squarefree parts, and
in §4.3 will bound the subsums. The details are rather heavy, but there are many symmetries
inherent in our counting procedure, and we try to give a conceptual explanation
of this. The results of §4.3 will be valid for arbitrary $k(x)$.

In §4.4, we will finish the proof just by summing the upper bounds previously obtained,
and using the assumption $k(x) = o(\log\log x)$. At the beginning of §5, we give a
heuristic justification for why (the proof of) Theorem 1 works out, which we hope the
reader will also find helpful.



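4.2. Preliminaries to the proof. Recall that, to deduce Theorem 1 from Central
Limit Theorem 1, it remains to verify the Lindeberg and cross terms conditions. As
noticed in general by Blei and Janson [1], for any $\epsilon > 0$ we have

\[ \sum_{p \leq x} \mathbb{E}\left( \frac{M_p^{(k)}(x)^2}{\mathbb{E}M^{(k)}(x)^2} 1_{|M_p^{(k)}(x)|/\sqrt{\mathbb{E}M^{(k)}(x)^2} > \epsilon} \right) \leq \frac{1}{\epsilon^2} \sum_{p \leq x} \frac{\mathbb{E}M_p^{(k)}(x)^4}{(\mathbb{E}M^{(k)}(x)^2)^2}. \]

Thus the Lindeberg condition certainly holds if, as $x \to \infty$,

\[ \sum_{p \leq x} \mathbb{E}M_p^{(k)}(x)^4 = o(\#\{n \leq x : \omega(n) = \Omega(n) = k\}^2). \]

In order to verify the cross terms condition, we must also study

\[ \mathbb{E}M_p^{(k)}(x)^2 M_q^{(k)}(x)^2 = \mathbb{E}\Big( \sum_{a \leq x, \, \omega(a) = k, \, P(a) = p} f(a) \Big)^2 \Big( \sum_{b \leq x, \, \omega(b) = k, \, P(b) = q} f(b) \Big)^2, \]

where $p$ and $q$ are primes. Introduce the notation

\[ S_{p,k,x} := \{n \leq x : \omega(n) = \Omega(n) = k, \, P(n) = p\}, \]

and write $S_{k,x}$ for the larger set without any restriction on $P(n)$. Notice that we can
write $\mathbb{E}M_p^{(k)}(x)^2 M_q^{(k)}(x)^2$ as

\[ \sum_{m \leq x^2, \, \omega(m) = \Omega(m)} \#\{(a,b) \in S_{p,k,x}^2 : s(ab) = m\} \, \#\{(c,d) \in S_{q,k,x}^2 : s(cd) = m\}, \]

where $s(x)$ denotes $x$ divided by its largest square factor, e.g. $s(120) = 30$.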
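The map $s$ is simple to compute by trial division. The following is a small sketch with an illustrative function name: it keeps exactly those primes that divide the input to an odd power.

```python
def squarefree_part(n):
    """Return s(n): n divided by its largest square divisor."""
    s, p = 1, 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            s *= p  # p divides the original n to an odd power
        p += 1
    return s * n  # whatever remains is 1 or a single prime


print(squarefree_part(120))  # 30, matching the example in the text
```

For squarefree $a, b, c, d$ the product $abcd$ is a perfect square exactly when `squarefree_part(a * b) == squarefree_part(c * d)`, which is why the expectation reduces to the count displayed above.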

The $m = 1$ term in the sum is $\#S_{p,k,x} \cdot \#S_{q,k,x}$; summing this over all $p$ and $q$ yields
$(\#S_{k,x})^2$, whilst for $k = o(\log\log x)$ we have (if $k \geq 2$)

\[ \sum_{p \leq x} (\#S_{p,k,x})^2 \leq \sum_{p \leq x} (\#S_{k-1,x/p})^2 = O\left( \sum_{p \leq x} \frac{x^2 (\log\log x + B)^{2k-4}}{(k-2)!^2 \, p^2 \log^2(x/p + 1)} \right) = O\left( \frac{x^2 (\log\log x + B)^{2k-4}}{(k-2)!^2 \log^2 x} \right) = o((\#S_{k,x})^2) \]

by Number Theory Results 1 and 2. Thus we will have established the Lindeberg and
cross terms conditions if we show that the $m = 1$ term gives the main contribution, i.e.
that

\[ \sum_{p,q \leq x} \sum_{W=1}^{k-1} \sum_{m \leq x^2, \, \omega(m) = \Omega(m) = 2W} \#\{(a,b) \in S_{p,k,x}^2 : s(ab) = m\} \, \#\{(c,d) \in S_{q,k,x}^2 : s(cd) = m\} \]

is $o((\#S_{k,x})^2)$ as $x \to \infty$. Values of $m$ with an odd number of prime factors cannot
arise as values of $s(ab)$ here: this will be clear when we begin to estimate the sums.

We must record some technical estimates that will be needed to complete the above 
programme. These are encapsulated in the following result. 

Technical Lemma 1. Let $a \in \mathbb{N}$, $n \in \mathbb{N} \cup \{0\}$, $C > 0$, and $2 \leq m \leq M$. Also let
$t \in \mathbb{N} \cup \{0\}$, and suppose that $t \leq D\log(N/m)$, where $D > 0$ and $N \geq 3m$. Then the
following hold:

(:\ (loglog(m/j+2)+C)" ^ f) { n!(loglogm+B)°-i \ . 

V) Z^^<j<m,u}{j)=a jlog2(m/j+l) \ (a-l)!logm J> 

(^^\ \^ (loglogO-+l)+C)" in+a-iy. \ . 

l^3<m,i^{j)=a jlogj ~ \ )' 

(WA V 1 ^ f)( M(log log M+i?)°-i 

V^^l l^m<j<M,uj{j)=a \og^{l+j/m) \{a-l)\ log M log^ (1+M/m)
where the constants implicit in the "big Oh" notation depend at most on $C, D$.

An interested reader will find sketch proofs of these in the appendix.

4.3. Bounds for the sums with W fixed. In this section we will prove: 

Lemma 1. Uniformly for $x \geq 3$, $1 \leq W \leq k - 1$; and with the conventions that
$(-1)!, (-2)! = 1$, and that the third summand is omitted when $j = 0$; we have

EE E ^) e ^Ik,. ■■ siab) = m}#{(c, d) e Sl,^, : s{cd) = m} 

p<x q<p m<x^ ,u){m)=n{m)=2W 

( / {2j - 2)\{2k -2W- 2)!(loglogx + Bf^-''^-'' 

(2j - 2)\{2W - 2j - 2)!(loglog3; + E)2fc-2ty-2 
+ {W-j + 

{2W - 2j - 2y.{2k - 2W - 2) ! (log log a; + Bf^-^ 



(w-j-iy.^U-iy.' 

Slightly extending the notation of §4.2, we define

\[ S_{<p,k,x} := \{n \leq x : \omega(n) = \Omega(n) = k, \, P(n) < p\}, \]

and similarly define $S_{\leq p,k,x}$. Our first important observation is that if squarefree $m$
satisfies $\omega(m) = 2W$, then

#{(a, 6) e Slk,x ■ 4ab) =m} = #{(a, b) e 5<p,fc_i,^/p : s{ab) = m} 

— ^ ^ ^>S'<p^fe_l_^^min{a;/j)(i,a;d/pm} ■ 

d\m;d,7n/ d<x /p; 
w(d) = W 

The reasoning here justifies our remark that we only need consider $m$ with an even
number of prime factors: if $s(ab) = m$ we must have $a = yd$, $b = y(m/d)$ for some $d \mid m$
and some $y$ (since $a, b$ are squarefree), and since $\omega(a) = \omega(b)$ then also $\omega(d) = \omega(m/d)$.
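This factorisation can be verified by brute force on small squarefree pairs. In the sketch below the helper names are illustrative. The common part is taken to be $y = \gcd(a, b)$, and the two cofactors $a/y$ and $b/y$ play the roles of $d$ and $m/d$.

```python
from math import gcd


def squarefree_part(n):
    """Return s(n): n divided by its largest square divisor."""
    s, p = 1, 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            s *= p
        p += 1
    return s * n


def is_squarefree(n):
    """True if no square greater than 1 divides n."""
    return squarefree_part(n) == n


def omega(n):
    """Number of distinct prime factors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)


def split_pair(a, b):
    """For squarefree a, b return (y, d, e) with a = y*d, b = y*e and s(ab) = d*e."""
    y = gcd(a, b)
    return y, a // y, b // y
```

For instance `split_pair(6, 10)` returns `(2, 3, 5)`, and indeed $s(60) = 15 = 3 \cdot 5$.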

For the next few lines we will write $S(p,d,m)$ as shorthand for the terms in the sum
over $d$; we do not continue to write that we sum over squarefree integers, although we
occasionally make use of this fact. We see the sum over $m$ in the statement of Lemma
1 is at most

EE E S{p,d,m) ■ S{q,e,m) 



d<x/p, e<^x/q, in<xd/ p^xe/ q; 
ui(d)=W uj(e)=W uj{m)=2W;[d,e]\m 



W 



-E E E E E S{p,hd',hd'e'm')S{q,he',hd'e'm'), 



j=0 h<x/p, d' <x/ph, e' <x/qh, m' <x / pe' ,x / qd' ,
where $[d,e]$ denotes the least common multiple of $d, e$, and we reorganise the summations
according to the highest common factor $h$ of $d, e$ (putting $d = d'h$, $e = e'h$, and
$m = m'de/h$).

The expression above looks difficult to work with, because of the many sums over
different ranges. However, by moving the sum over $h$ to the middle we obtain
w 

J2 J2 J2 Yl S(p,hd',hd'e'm')S(q,he',hd'e'm'), 

j=0 d^<-x/p, e'<x/p, h<.x /pd' ,x /qe\ <x / pe^ ,x / qd^ , 

ui(d') = W-j uj{c') = W-j i^(h)=j uj(m')^j 

where the ranges of summation over $d'$ and $e'$ are the same. (We can restrict the variable
$e'$ to be smaller than $x/p$, rather than $x/q$, because for larger $e'$ the sum over $m'$
is empty). Summing this over $q < p$, with the observation that $S(q, he', hd'e'm') =$

H^Sq^k-W,rain{x/he' ,x/d'm,'} i yields 
W 

Y J2 Y H E S{p,hd',hd'e'm')i^S<,,, 

— W,mhi{x/he' ,x/d'm'} ■ 

j = d^'Cx/p, e'<a;/p, h<x / pd\x / , m' < x/pe' ,x/d' , 

Lj(d')^W~j Lj(e') = W-j (^(h)=j t>j{m')=J 

(The constraints on $h$ and $m'$ that involve $q$ translate into constraints in the summation
over $q$, which are seen to be redundant). Now summing over $p \leq x$, we conclude that
the left hand side in Lemma 1 is at most
w 

E E E E E '^^k-W,min{x/hd',x/e'm'}#Sk-W,min{x/he',x/d'm'} 

j=0 d'<x, e'<a:, hKx/d'^x/e', m'<x/e',a:/d', 

u(d')=W-3 u(<i')=W-i wih)=j u>(m')=j 

W 

< 4 ^ E E #^k-w,x/hd' 

j=Q d'<x, e'<d', h<x/d>, \ m> <e' h / d' , &> h/ d' <m' <h, 

using the symmetry of d' and e', and of h and m', that emerges. 

(To understand the above argument qualitatively, the reader may find it helpful to
visualise an annulus, representing a number $m$, split into four sectors: a $j$ prime factor
part, representing $h = (d,e)$; two $W - j$ prime factor parts adjoining this, representing
$d' = d/h$ and $e' = e/h$; and a $j$ prime factor part representing $m' = m/hd'e' = hm/de$.
The symmetry that we have manifested is of the pair $h, d'$ with the pair $m', e'$; for
example, although we have $d'h = d \leq x/p$ and $e \leq x/q$ (which may be larger), we also
have $e'm' = m/d \leq x/p$.)



( 

-W,x/he' —W,x/d''n 



In order to reduce the bound we obtained to the form asserted, we will use Number
Theory Result 1 and Technical Lemma 1. It is easiest to treat the terms where $j = 0$
and $j = W$ separately; both of these^5 are
which is 

/ 



<x,u}(d')=W 



jthSk-W,x/d'Y ^e'<d' ,ui{e')=W ^ 



o 



X 



{k-W -l)\\W d'\og^{x/d' + l)\og{d' + 1) 



(loglog(x/c/' + 1) + 5 



^2k-2W-2 



(loglog(rf' + l) + S) 



W-l 



\ 



Splitting the sum at $d' = \sqrt{x}$, and applying the first and second parts of Technical
Lemma 1, we obtain the $j = 0$ terms in the statement of Lemma 1.

Now fix $1 \leq j \leq W - 1$, and consider the second part of the bound, where we sum
over large $m'$. This is $x^2/(k - W - 1)!^2$ multiplied by

^ 1 ^ {loglog{x / hd' + 1) + B)''-^'^ 



O 



d'<x, 



h<x/d', 



hlog{x/hd' + 1) 



h/d'<m' <h, 
U}(m')=j 



(loglog(x/m'rf' + + B) 
m' \og{x /m'd' + 1) 



k~W~l 



\ 



E 1 



e,' <d'm' /h, 
ui{e')^W-j 



( 



o 



d'<x, 
o{d') = W-j 



(loglogK + l) + 5)^-^^^ ^ (loglog(x/M' + l) + 5) 



k-W-1 



{w-j- ly.d' 



h<x/d', 



\og\x I hd' + 1) 



E 



h/d'<m'<h, 



([og\og{x / m' d' + 1) + B 
\og{d'm'/h + 1) 



\k-W-l 



I 



where in the first bound we move the sum over $e'$ to be performed earlier, because no
summand is a function of $e'$ except through its range of summation. In the second bound
we overestimate $(\log\log(d'm'/h + 1) + B)^{W-j-1}$ and $1/\log(x/m'd' + 1)$ by their values
at the $m' = h$ end of the range of summation. We expect that most of the contribution
to the sum over $m'$ will come from this end, since the factors in the summand that
decrease with $m'$ do so slowly, and we will see that we do not lose much by doing this.

If $\omega(n) = a$ for some $n \leq y$, then certainly $a \leq \log y/\log 2$, so in estimating the sum
over $m'$ we can assume that $k - W \leq \log(x/hd')/\log 2$. Then by the Cauchy-Schwarz
inequality, and the third and fourth parts of Technical Lemma 1, the above is



O 



I y 



(loglog(c;^ + 1) + g) 
d'\og{d' + l) 



^5 The fact that the $j = 0$ and $j = W$ contributions are the same, although it can be seen directly, also
follows from the "counting around an annulus" description. In general, the $j$ and $W - j$ summands in
Lemma 1 are the same.
(\log\log(x/hd' + 1) + B)^{2k-2W-2} (\log\log(h + 1) + B)^{j-1}



E 



AW(x/M' + l)log(A + l) 

where the value of $B$ is possibly increased by a fixed amount from that in Number
Theory Result 1. An exactly similar argument shows that this expression, multiplied
by $x^2/(k - W - 1)!^2$, also majorises the first part of our original bound (sum over small
$m'$).

Splitting the range of summation over $h$ at $\sqrt{x/d'}$, and applying the first and second
parts of Technical Lemma 1 as before, we find this is at most

(2k -2W- 2)! „ (\ogloe(d' + 1) + Br-i-' , 

+ rf'log(d' + l)V(./d' + l)""«'°«<^/'' + 1' + ^) 

Estimating the sums over d' in the same way proves Lemma 1. 

Q.E.D. 



2 



4.4. Conclusion of the proof. In this section we will deduce Theorem 1 from Lemma
1, the preliminary observations in §4.2, and Number Theory Result 2. In view of
these facts, it will suffice to show that the following are $o(1)$ as $x \to \infty$, when
$k(x) = o(\log\log x)$:



^ {k - l)\\2k - 2W - 2)\ (2j - 2)!(IogIog.T + B)2M/-2j-2fc ^ 
{k-W-l)\^ ^ {W-j- - 1)!2 ' 

' (A;-l)!2(logloga; + 5)-2W^'^'(2j-2)!(2iy-2j-2)! 



{k-W-iy.'^ ^ (W^-j-l)!2(j-l)!2 ■ 

The third sum in Lemma 1 is bounded above by the first, which can be seen, as remarked
earlier, by replacing $j$ by $W - j$ and adding the terms in reverse order.

We illustrate a suitable argument, which is just straightforward analysis, for the first 
expression. In the inner sum, the ratio of the j + 1 and j summands is 

$$\frac{(2j)!\,(\log\log x + B)^{2W-2j-2k-2}\,(W-j-1)!^2\,(j-1)!^2}{(2j-2)!\,(\log\log x + B)^{2W-2j-2k}\,(W-j-2)!^2\,j!^2} \le \frac{4(W-j-1)^2}{(\log\log x + B)^2}.$$

Since $W - j - 1 < W < k$, this is smaller than 1/2 for $x$ large enough (depending on
how quickly $k(x)/\log\log x$ tends to 0 with $x$), and the inner sum is then at most

$$\frac{2(\log\log x + B)^{2W-2k}}{(W-1)!^2}.$$
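The ratio bound just used can be sanity-checked with exact rational arithmetic. The following is only an illustrative sketch, not part of the argument: the names `summand` and `ratio_bound_holds` are introduced here, the parameter `L` stands in for $\log\log x + B$ (taken to be an integer purely for convenience), and the form of the summand is the one read off from the displayed ratio.

```python
from fractions import Fraction
from math import factorial


def summand(j, W, k, L):
    """j-th inner summand: (2j-2)! L^(2W-2j-2k) / ((W-j-1)!^2 (j-1)!^2).

    L is a stand-in for log log x + B (an assumption made for illustration).
    """
    power = Fraction(L) ** (2 * W - 2 * j - 2 * k)
    denominator = factorial(W - j - 1) ** 2 * factorial(j - 1) ** 2
    return Fraction(factorial(2 * j - 2)) * power / denominator


def ratio_bound_holds(W, k, L):
    """Check summand(j+1)/summand(j) <= 4 (W-j-1)^2 / L^2 for 1 <= j <= W-2."""
    return all(
        summand(j + 1, W, k, L) / summand(j, W, k, L)
        <= Fraction(4 * (W - j - 1) ** 2) / Fraction(L) ** 2
        for j in range(1, W - 1)
    )


if __name__ == "__main__":
    print(all(ratio_bound_holds(W, 8, 10) for W in range(2, 8)))
```

The exact ratio is $2j(2j-1)(W-j-1)^2/(j^2 L^2)$, and the bound comes from $2j(2j-1) \le 4j^2$; the check simply confirms this for small parameters.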






Treating the outer sum in the same way, we find it is dominated by the $W = k - 1$ term
when $x$ is large; and that term is

$$2(k-1)^2(\log\log x + B)^{-2} = o(1) \quad \text{as } x \to \infty.$$

To deal with the second expression, we just note that 

(2j-2)!(2iy-2j-2)! ^ ^ 

^ (w-j-mj-iy.^ - ^ 

and then bound the outer sum as before (the W = 1 term proving to be the dominant 
one). 

Q.E.D. 

5. Proof of Proposition 1 

We notice that 



p<x P^x P^x q<p 

= ^p''^ (a;)'M« {xf -2j2 ™f ) 

p<x q<x P^x 

since each term in the expansion of ^{Ylp<x M^\x)Y is non-negative. Referring to the 
calculations in §4.2, we see that the term m = 1 there produces the "main term" 3 in
Proposition 1, and so to prove Proposition 1 it will suffice to obtain analogous lower 
bounds for any terms from 

p,q<x W=l m<x'^ ,oj{m)=n{m)=2W 

Because we are trying to establish lower bounds, we will not be able to omit co-primality 
or squarefree-ness conditions in our computations; so we must choose terms a little 
carefully to obtain expressions that we can usefully work with. 

When $k = o(\log\log x)$, we know from §4 that the $W = 0$ term dominates the whole
of the above sum. Roughly speaking, this is because $m$ cannot often have small prime
factors, since numbers $a, b$ satisfying $\omega(a) = \omega(b) = k$ typically do not; and insisting
that it should have certain large factors (when $W \ge 1$) greatly reduces the possibilities
for $a, b$. For larger $k$ this reasoning fails, and e.g. the $W = 1$ term becomes comparable



^One also needs to check that J2p<x(^^p,k,x)'^ = o{{#Sk,x)'^) on the whole range of k in Proposition 
1, and not just when k — o(logloga;). This follows because, for example, 

^(#'5'p,fc,K)^ = ^ {#Sp^k,xY < #Sk-l,x/k ^ #Sp,k,x ^ #Sk-l^x/k ■ #Sk,x^ 
p<.x k<p<x k<~p<x 

where #Sk^i^x/k — o{^Sk.x) by Number Theory Result 4. 
with, and eventually much larger than, the $W = 0$ term. We can write the $W = 1$ term
explicitly and simply, as



E E E mipa'r,pa's) E J#{(gc'r, qc's) G 

p,q<X r<min{p,q}, s<r, 

r is prime s is prime 

4 E E ih{t^Sk-i,.,r:P{t)>r,r\t,s\t}\ 



r<x. s<r, 
r is prime s is prime 



recalling that P{t) denotes the largest prime factor of t. 

On a range $2 \le k \le R\log\log x$, for any fixed $R > 0$, the single pair $(r, s) = (3, 2)$
gives a simple lower bound for the $W = 1$ term. Recalling the notation $\kappa = k/L$ from
the introduction, when $x$ is large enough (depending on $R$) we have $\kappa \le 3R/2$. By
Number Theory Result 5, then



(A;-2)!log(x/3) 
f k-1 \ x(loglogx) 
Vloglog^y ^(A;-l)!loi 



j/c-l 

logx 

> fc#{n < X : w(n) = Q{n) = k}. 



In particular, if $\epsilon > 0$ and $\epsilon\log\log x \le k \le R\log\log x$, then for $x$ large enough the
above is

^e,R < X : a;(n) = Q{n) = k}. 

When k < (log log a;)^'^ is larger, we must include more pairs of primes when seeking 
a lower bound. Qualitatively, numbers t < x with many more than log log x prime 
factors are almost always divisible by small primes like 2 and 3, so we need to look at a 
range of larger primes $r, s$. We will need a de Bruijn-type estimate, given in Chapter 7
of Montgomery and Vaughan [15]: for any $\epsilon > 0$, and $y$ sufficiently large depending on
$\epsilon$, one has

$$\#\{r \le y : P(r) \le \log^2 y\} \le y^{1/2+\epsilon}.$$

Assume that $k > \log\log x$, so $\kappa > 1$. Combining the estimate with Number Theory
Result 5, we find (with much to spare) that when $r < 4k$, and $x$ is larger than an
absolute constant,

$$\#\{t \in S_{k-1,x/r} : P(t) > r,\ r \nmid t,\ s \nmid t\} \ge \#\{t \in S_{k-1,x/r} : r \nmid t,\ s \nmid t\} - \#\{t \le x/r : P(t) \le r\}$$

$$= (1 + o(1))\,\#\{t \in S_{k-1,x/r} : r \nmid t,\ s \nmid t\}.$$

This implies that if a pair of primes $r, s$ satisfies $2k < r < 4k$, $k < s < r$, then

1 \ \ \ a;(log log x)^"^ 1 



#{tG^,_i,,/,:P(t)>r, rtt,stt} » G \^k { 1 + O { - 

> G(k) 



kJJJ {k-2)\\ogx (r + k) 
x(loglogx)'^~^ 1 



(fc-2)!logx (r + k)' 

The second inequality uses the logarithmic derivative estimate at the end of §3.
We now sum over all such $(r, s)$, using the Chebychev-type estimate

$$0.9212y + O(\log y) \le \sum_{\substack{p^m \le y, \\ p \text{ prime},\ m \in \mathbb{N}}} \log p \le 1.1056y + O(\log^2 y) \quad \text{if } y \ge 2,$$

and the fact that if $y > 1$ then there is a prime on the interval $(y, 2y)$. See Chapter 2
of Montgomery and Vaughan [15]. It follows, as claimed, that the $W = 1$ term is

^ - 2x2(loglogx)2^ r / k 

^ ' {k-2y?\og^x {r + kf\ogr \\og{k + 2) 

r is prime 
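The two prime number facts quoted above (the Chebychev-type bounds for the sum of $\log p$ over prime powers $p^m \le y$, and the existence of a prime in $(y, 2y)$) are easy to examine numerically. The sketch below is purely illustrative: the function names are introduced here, the $O(\cdot)$ terms are ignored, and only the ratio of the sum to $y$ is inspected at a few sample points.

```python
from bisect import bisect_right
from math import log


def primes_upto(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def chebyshev_psi(y, primes):
    """Sum of log p over all prime powers p^m <= y."""
    total = 0.0
    for p in primes:
        if p > y:
            break
        power = p
        while power <= y:
            total += log(p)
            power *= p
    return total


def has_prime_between(y, primes):
    """True if some prime p satisfies y < p < 2y (primes must reach 2y)."""
    idx = bisect_right(primes, y)
    return idx < len(primes) and primes[idx] < 2 * y


if __name__ == "__main__":
    ps = primes_upto(20000)
    for y in (100, 1000, 10000):
        print(y, chebyshev_psi(y, ps) / y)
```

For the sample values of $y$ the ratio sits comfortably between 0.9212 and 1.1056, though of course a finite check says nothing about the error terms.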

When k < 5 log x/ (log log x)^ is even larger, we do not estimate #{t G Sk-i,x/r 
P{t) > r, r \ t,s \ t} so precisely. Instead, we rewrite the W = 1 term as 

4 ^ ^ ^ lr<P(t),P(M); r\t,u ^ '^s]t,u 

t£Sk-l,xU£Sk^-L,x r<x/t,x/u, s<r, 

r is prime s is prime 

lp(t),P(M)>3A:IogA: '^r\t,u Isfi.u 

teSk--L.x/3klogk-^'^Sk-l.x/3klogk r<3fclogfc, s<r, 

' ' r IS prime s is prime 

> I E lp(t)>3fclogfc I , 

Y*G'S'fe-l,a:/3fc log k 

in view of the Chebychev-type lower bound for the sums over primes. (Here we noted
that at most $2k - 2$ of the primes $r$ less than $3k\log k$ are excluded by the presence of
the indicator function.)

As before, but this time using Number Theory Result 3, we find 

E lpW>3.1ogfc = (l + 0(l))#5fc 

— l,x/3k logk- 

^^^k — l,x/3k log k 

Using both parts of Number Theory Result 4, and assuming as always that x is large 
enough, this gives a lower bound for the W = 1 term that is 

> — YTi.H^Sk~i,xY > { TT7\ r#5'fc,x ] > { -. rifSk,x 

log k \W\ogk J \logk 

as asserted in Proposition 1. 
Q.E.D.



6. Proof of Theorem 2 

6.1. Strategy of the proof. To establish Theorem 2, we will use a general fact about
convergence in distribution of random variables: if $X_n, X$ are real valued random
variables, and $X_n \xrightarrow{d} X$ as $n \to \infty$, then

$$\mathbb{E}g(X_n) \to \mathbb{E}g(X) \quad \text{as } n \to \infty$$

whenever $g : \mathbb{R} \to \mathbb{R}$ is continuous and bounded. This is sometimes given as the definition
of convergence in distribution.

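In particular, if $a > 0$ is fixed, and $X_n \xrightarrow{d} N(0,1)$ as $n \to \infty$, then

$$\mathbb{E}\min\{X_n^2, a^2\} \to 1 - \frac{2a}{\sqrt{2\pi}}e^{-a^2/2} + 2(a^2-1)(1-\Phi(a)) \quad \text{as } n \to \infty,$$

since $(2\pi)^{-1/2}\int_a^\infty y^2 e^{-y^2/2}\,dy = (2\pi)^{-1/2}ae^{-a^2/2} + (1-\Phi(a))$.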
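The limiting value is the truncated second moment $\mathbb{E}\min\{Z^2, a^2\}$ of a standard normal variable $Z$, and the closed form can be compared with a direct numerical integration. This is an illustrative check only; the function names, the midpoint rule, and the integration cutoff are choices made here rather than anything taken from the text.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(y):
    """Density of the standard normal distribution."""
    return exp(-y * y / 2) / sqrt(2 * pi)


def normal_cdf(a):
    """Distribution function Phi(a) of the standard normal distribution."""
    return 0.5 * (1 + erf(a / sqrt(2)))


def truncated_moment_closed(a):
    """1 - (2a/sqrt(2 pi)) e^{-a^2/2} + 2 (a^2 - 1)(1 - Phi(a))."""
    return 1 - 2 * a * normal_pdf(a) + 2 * (a * a - 1) * (1 - normal_cdf(a))


def truncated_moment_numeric(a, steps=200_000, cutoff=12.0):
    """Midpoint-rule approximation to the integral of min(y^2, a^2) phi(y)."""
    h = 2 * cutoff / steps
    total = 0.0
    for i in range(steps):
        y = -cutoff + (i + 0.5) * h
        total += min(y * y, a * a) * normal_pdf(y)
    return total * h


if __name__ == "__main__":
    for a in (0.5, 1.0, 3.0):
        print(a, truncated_moment_closed(a), truncated_moment_numeric(a))
```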

With this in mind, it will suffice to find numbers $a > 0$ and $T_a$ such that
$T_a > \frac{2a}{\sqrt{2\pi}}e^{-a^2/2} - 2(a^2-1)(1-\Phi(a))$, and

$$\mathbb{E}\min\{M^{(k)}(x)^2, a^2\} \le 1 - T_a$$

when $x$ is large (and $\epsilon\log\log x \le k(x) \le R\log\log x$). It may not be immediately clear
that there is any sensible way to go about this, but general considerations at least
suggest that the expression on the left should capture important information about
$M^{(k)}(x)$. As remarked in §2, a normal approximation does hold for $M^{(k)}(x)$ essentially
when a sum of squared increments converges in probability to 1. This quantity is closely
related to the square of $M^{(k)}(x)$, e.g. by Burkholder's inequality, as expounded in Hall
and Heyde's book [9].

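To proceed further, observe that if $q$ is any prime number then

$$\begin{aligned}
\mathbb{E}\min\{M^{(k)}(x)^2, a^2\} &= \mathbb{E}\left(\mathbb{E}(\min\{M^{(k)}(x)^2, a^2\} \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)\right) \\
&\le \mathbb{E}\left(\min\{\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q), a^2\}\right) \\
&= 1 - \mathbb{E}\left(\max\{\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q) - a^2, 0\}\right) \\
&\le 1 - \mathbb{P}\left(\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q) \ge a^2 + 1\right).
\end{aligned}$$

For given $x$, if we chose $q \ge x$ then the first inequality would be an equality. This suggests
that computing $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$ as a function of $\epsilon_2, \ldots, \epsilon_q$ will be difficult
when $q$ is large, (it amounts to explicitly determining the distribution of $M^{(k)}(x)^2$), and
we shall not attempt this.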
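The chain of inequalities above can be observed exactly on a toy example by enumerating every choice of signs. The sketch below makes simplifying assumptions that are not in the text: it uses the full sum over squarefree $n \le x$ (rather than the sum restricted to $\omega(n) = k$), normalised by the square root of the number of squarefree $n \le x$ so that the second moment is 1, and all function names are introduced here for illustration.

```python
from itertools import product
from math import isclose, prod


def primes_upto(n):
    """Primes <= n by trial division (fine for the tiny n used here)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def squarefree_prime_lists(x, primes):
    """Prime factor lists of the squarefree n <= x (n = 1 gives the empty list)."""
    lists = []
    for n in range(1, x + 1):
        m, factors = n, []
        for p in primes:
            if m % p == 0:
                m //= p
                factors.append(p)
        if m == 1:  # each prime divided n exactly once, so n is squarefree
            lists.append(factors)
    return lists


def conditioning_chain(x, q, a):
    """Return the four quantities in the chain, conditioning on primes <= q."""
    primes = primes_upto(x)
    lists = squarefree_prime_lists(x, primes)
    n_cond = sum(1 for p in primes if p <= q)
    groups = {}
    for signs in product((1, -1), repeat=len(primes)):
        eps = dict(zip(primes, signs))
        total = sum(prod(eps[p] for p in fs) for fs in lists)
        groups.setdefault(signs[:n_cond], []).append(total * total / len(lists))
    values = [v for vs in groups.values() for v in vs]
    cond = [sum(vs) / len(vs) for vs in groups.values()]  # conditional 2nd moments
    lhs = sum(min(v, a * a) for v in values) / len(values)
    mid = sum(min(c, a * a) for c in cond) / len(cond)
    alt = 1 - sum(max(c - a * a, 0) for c in cond) / len(cond)
    low = 1 - sum(1 for c in cond if c >= a * a + 1) / len(cond)
    return lhs, mid, alt, low


if __name__ == "__main__":
    print(conditioning_chain(30, 5, 1.0))
```

The first inequality is Jensen's inequality for the concave function $y \mapsto \min\{y, a^2\}$, the equality uses that the unconditional second moment is 1, and the last step uses $\max\{y - a^2, 0\} \ge 1$ whenever $y \ge a^2 + 1$.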
However, for any given $q$, fixed before seeing what happens when $x \to \infty$, the
computation becomes more feasible. In §6.2, we will carry this out to obtain an explicit,
although not extremely enlightening, answer. Given this, there are two obvious ways to
try to finish the proof of Theorem 2:

(i) calculate the value (as a function of $x$, $k$) of $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$ for every
possibility $\epsilon_p = \pm 1$, for some small values of $q$, and see if this leads to values
$a$, $T_a$ with the desired properties;

(ii) choose values of $\epsilon_p$ that allow for good estimates of $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$ as
$q$ becomes large, and then vary $q$ and $a$ until one can obtain suitable $T_a$.

It is the second approach that leads to a proof of Theorem 2, as will be shown in
§6.2 (just looking at $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2 = 1, \epsilon_3 = 1, \ldots, \epsilon_q = 1)$). The first approach
yields partial results, which are of some interest in that explicit numerical bounds for
$\mathbb{E}\min\{M^{(k)}(x)^2, a^2\}$ are obtained, and we will discuss this briefly in §6.3.

The idea of using conditioning to explore the behaviour of $M^{(k)}(x)$ seems, to the
author, rather natural, and it was already used in a heuristic way by Hough [12]. He
performs numerical simulations of the complete sum $M(x)$, both unconditionally and
conditioning on a small number of $\epsilon_p$, and (looking at the empirical distribution
functions) is led to conjecture that $M(x)$ "...looks like a combination of conditional Gaussian
distributions, whose variances depend on the value of $f$ on the first few primes."

Hough [12] also explains a heuristic of Chatterjee that $M(x)$ should not, in the limit,
have a normal distribution: if it did, it seems likely that the distributions conditional on
$\epsilon_2 = 1$ and $\epsilon_2 = -1$ would also tend to normality, whilst these have distinct variances.
The extent of the impact of $\epsilon_p$ on $M^{(k)}(x)$, when $p$ is 'small' and $k$ is 'not small', is what
underlies both of these heuristics, and also the proof of Theorem 2. If the behaviour
of $M^{(k)}(x)^2$ were very close to, but not equal to, that of $N(0,1)^2$, our approach would
probably not detect this.

6.2. Completion of the proof. We now implement the strategy proposed in §6.1,
beginning by deriving a useable expression for $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$. The random
multiplicative function $f$ will appear in this expression, but only applied to integers $N$
whose prime factors are at most $q$, as 'short hand' for $\prod_{p \mid N}\epsilon_p$. We will also write

$$S_{\le q, \le k, x} := \{n \le x : \omega(n) = \Omega(n) \le k,\ P(n) \le q\}.$$

Expanding the square, we see that $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \ldots, \epsilon_q)$ is

E E(/(n)/(m)|e2,...,e,) 
ui(M)=u(N),M^N

Using Number Theory Result 5, we can replace the inner sum by 



. P J ' ' ' ' ' {k - I - u{N))\\ogx' 

where the $o(1)$ terms are with respect to the limit process $x \to \infty$, for any fixed $q$. On
the range of $k$ treated by Theorem 2, this is



so for any fixed $q$, we conclude that $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$ is

p<q ^ ^ ^ NeS<g^<k,^ AfeS'<,_<fc_^, 

u;(M)=w(iV),M<JV 

Turning specifically to $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2 = 1, \epsilon_3 = 1, \ldots, \epsilon_q = 1)$, it is clear that for any
fixed $q$ this is at least

i+2n(i + j)"' E E 

ui(M)=ui(N),M<N 

as $x \to \infty$. Moreover, provided $x \ge q^2 - 1$ is large enough (also depending on the value
of $\epsilon$ in the statement of Theorem 2), this is at least

^nO-f)"' E ^ E 1. 

which does not depend on $x$. We shall finish the proof of Theorem 2 by showing that,
when $q$ is sufficiently large (depending on $\epsilon$ and $R$) and $x$ is sufficiently large, taking
the preceding expression as $a^2 + 1$ yields

$$\frac{2a}{\sqrt{2\pi}}e^{-a^2/2} - 2(a^2-1)(1-\Phi(a)) < 2^{-\pi(q)} \le \mathbb{P}\left(\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q) \ge a^2 + 1\right).$$

Writing $p, q, r$ for prime numbers, observe that if $N \le q^2$,

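$$\sum_{M \in S_{\le q, 2, N-1}} 1 = \sum_{p \le q}\ \sum_{p < r \le \min\{q, (N-1)/p\}} 1 = \sum_{p < N/q}\ \sum_{p < r \le q} 1 + \sum_{N/q \le p \le q}\ \sum_{p < r < N/p} 1$$

$$= \sum_{p < N/q}\ \sum_{p < r \le q} 1 + \sum_{N/q \le p < \sqrt{N}}\ \sum_{p < r < N/p} 1.$$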
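The counting identity above says that the number of products $pr \le N - 1$ of two distinct primes $p < r \le q$ can be found by splitting according to whether $p < N/q$ or $p \ge N/q$. A brute-force comparison is straightforward; the sketch below is illustrative only, with function names introduced here and the inequalities taken in the form reconstructed from the surrounding argument.

```python
def primes_upto(n):
    """Primes <= n by trial division."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def count_direct(N, q):
    """Number of M = p*r <= N - 1 with p < r <= q both prime."""
    ps = primes_upto(q)
    return sum(1 for i, p in enumerate(ps) for r in ps[i + 1:] if p * r <= N - 1)


def count_split(N, q):
    """Same count via the two ranges p < N/q and N/q <= p < sqrt(N)."""
    ps = primes_upto(q)
    # p < N/q: every prime r with p < r <= q is allowed.
    first = sum(1 for i, p in enumerate(ps) if p * q < N for r in ps[i + 1:])
    # N/q <= p < sqrt(N): r ranges over primes with p < r < N/p.
    second = sum(1 for i, p in enumerate(ps) if p * q >= N and p * p < N
                 for r in ps[i + 1:] if p * r < N)
    return first + second


if __name__ == "__main__":
    q = 13
    print(all(count_direct(N, q) == count_split(N, q) for N in range(2, q * q + 1)))
```

The comparisons are written with integer products ($pq < N$ in place of $p < N/q$) to avoid floating-point division.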



k-1 
One could estimate these sums very precisely using standard estimates for the distribution
of prime numbers, but for us a fairly crude approach will suffice. If $2q \le N \le q^2$
then the first double sum is at least

,7^/,,/t^<, logglog(iV/g) 
using the Chebychev-type lower bound quoted in §5. We then have 



logglog(Ar/g + 2) 

whenever $7 \le N \le q^2$, since the left hand side is a non-decreasing function of $N$, and
when $N < 2q$ the left hand side is simply $\#S_{2,N-1}$.

Using this bound, we find for $q$ sufficiently large that our choice of $a^2 + 1$ is

V Pj log^ q 

Provided $q$ is larger than some absolute constant, a classical estimate of Mertens for
$\sum_{p \le y} 1/p$ reveals that the product is at least $(2\log q)^{-1}$. See Chapter 2 of Montgomery
and Vaughan's book [15], for example. This completes the proof of Theorem 2, with
much to spare.

Q.E.D. 

6.3. Computational aspects. Recall that $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$ is

P<q ^ NeS<g^<k,x A//eS<^_<fe^„ 

ui(M)^ui(N),M<N 

for any fixed $q$. Because of the restriction that $\omega(M) = \omega(N)$ in the double sum, the
number of terms that must be evaluated in this expression does not increase too quickly
if $q$ is increased.

Assuming, for simplicity, that $\kappa = 1$; that $x > \prod_{p \le 29} p$ is large, so the sum over $N$ is
over all squarefree numbers satisfying $P(N) \le q$; and ignoring the $o(1)$ term (which can
be made negligibly small, for computational purposes, by taking $x$ large enough); the
author used Mathematica to evaluate the expression for values of $q$ up to 29. This can
be done at every point of the sample space, but in Table 1 we only present the maximal
values attained.

These calculations are enough to establish Theorem 2 when $\kappa = 1$, and, since the
expression computed is a continuous function of $\kappa$, also when $1 - \delta \le \kappa \le 1 + \delta$ for some
constant $\delta > 0$. We see

$$\mathbb{E}\left(\max\{\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_{29}) - 9, 0\}\right) \ge \frac{1}{2^9}(11.777 - 9) > 0.0054;$$
  $q$     $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2 = \ldots = \epsilon_q = 1)$
   2      1.000
   3      1.333
   5      1.806
   7      2.472
  11      3.249
  13      4.310
  17      5.603
  19      7.305
  23      9.378
  29     11.778

Table 1. Conditional second moments, rounded to 3 decimal places.



and using tables of $\Phi(a)$, or using Mathematica,

$$\frac{6}{\sqrt{2\pi}}e^{-9/2} - 16(1-\Phi(3)) < 0.02660 - 0.02159 = 0.00501 < 0.0054.$$
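The numerical values in this comparison (the case $a = 3$, so that $2a = 6$ and $2(a^2 - 1) = 16$) can be reproduced with the standard library instead of tables. This is only a convenience check; the function name is introduced here, and the constants 0.02660, 0.02159 and 0.0054 are the ones quoted in the text.

```python
from math import erfc, exp, pi, sqrt


def normal_threshold_terms(a):
    """Return ((2a/sqrt(2 pi)) e^{-a^2/2}, 2 (a^2 - 1)(1 - Phi(a)))."""
    density_term = 2 * a / sqrt(2 * pi) * exp(-a * a / 2)
    # 1 - Phi(a) = erfc(a / sqrt 2) / 2, which avoids cancellation for large a.
    tail_term = 2 * (a * a - 1) * 0.5 * erfc(a / sqrt(2))
    return density_term, tail_term


if __name__ == "__main__":
    density_term, tail_term = normal_threshold_terms(3.0)
    print(density_term, tail_term, density_term - tail_term)
```

Rounding the first term up and the second term down, as the text does, gives a valid upper bound for their difference.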
V 27r 



7. Sketch proof of Corollary 1 

First we show that Theorem 1 can be extended, by showing that M^''\x) — M^-'^^x) 
converges in probability to 0 as $x \to \infty$, when $k(x) = o(\log\log x)$. Introduce the
temporary notation 

M^^''\x) ■.^M^^''\x)/^fmm{xf. 

For any $\epsilon > 0$, we have

P(|M('=)(x)-M(^^)(^)I ^ ^) ^ f (e(M(*=)(x) - M^^''\x)f + ¥.{M^^''\x) - M^^^\x)f^ , 

by Chebychev's inequality. Using Number Theory Results 1 and 2, it is straightforward 
to show that the bracketed term is o(l), as required. 
The extension of Theorem 2 is based on the inequality

min{ fc , [2 log log x\ } 

E(M(^'=)(x)2|e2 = ... = e, = l) > ^ E(M»(x)2|e2 = ... = = 1). 

i=min{ [k/2], [log log 1/2] }+ 1 

If $k(x) \ge \epsilon\log\log x$, then each $i$ in the range of summation satisfies

$$\min\{\epsilon/2, 1/2\}\log\log x \le i \le 2\log\log x,$$

so as in §6.2 the summand is



\p<q ^ AreS<,2,,2_i,JV>7 Mg5<^,2,jv-i / 

provided x is large enough. This suffices for the result. 
8. Sketch proof of Theorem 3



We begin with the extension of Theorem 1 to the more general setting. We will
write $g(n)$ for our generalised multiplicative function, where we allow non-Rademacher
distributions for the underlying random variables $\epsilon_p$, whilst we continue to use $f(n)$ to
denote a Rademacher random multiplicative function. Letting $p$ and $q$ be primes, we
observe that



Splitting the expectation in this way is just looking at $m = 1$, and at other values of
$m$, as in the proof of Theorem 1; we choose not to write it like this because some of the
subsequent steps will be different. Summing the $m = 1$ term over all pairs of primes
$p, q$, using our assumptions about the $\epsilon_p$, we obtain



\ i=l t<x,<jj{t)=i I 

Similarly to Technical Lemma 1, one can show that (if $1 \le i \le k - 1 \le \log\log x$, and $x$
is large) the inner sum is



where $P_i$ is the least integer such that $\omega(P_i) = i$, namely the product of the first $i$
primes. Thus the $m = 1$ term is $(1 + o(1))(\#S_{k,x})^2$, as $x \to \infty$ with $k(x) = o(\log\log x)$.
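The description of $P_i$ as the product of the first $i$ primes is easy to confirm by brute force for small $i$. The following sketch is illustrative only, and the function names are introduced here.

```python
def omega(n):
    """Number of distinct prime factors of n."""
    count, d = 0, 2
    while d * d <= n:
        if n % d == 0:
            count += 1
            while n % d == 0:
                n //= d
        d += 1
    return count + (1 if n > 1 else 0)


def least_with_omega(i):
    """Least positive integer n with omega(n) = i, found by direct search."""
    n = 1
    while omega(n) != i:
        n += 1
    return n


def primorial(i):
    """Product of the first i primes."""
    result, found, candidate = 1, 0, 2
    while found < i:
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            result *= candidate
            found += 1
        candidate += 1
    return result


if __name__ == "__main__":
    print([(i, least_with_omega(i), primorial(i)) for i in range(1, 6)])
```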

As in the proof of Theorem 1, it remains to show that the contribution from 'other
values of $m$', when summed over primes $p, q$, is $o((\#S_{k,x})^2)$. Given $a, b, c, d \ge 2$, write
$h(a,b,c,d)$ for the highest common factor of $a/P(a), b/P(b), c/P(c), d/P(d)$, and put
$a' = a/h(a,b,c,d)$, and similarly $b', c', d'$. Then the contribution is at most




o 




J2 Y ^^'^^'^^"'''''^''^)^%(a')5(&')^(cO^(rf') 



< 



Y.^'Y. E E Ma,,,c4)=i^g{a)g{h)g{c)g{d) 
E E ma)mf(c)m,

where the additional multiple of $C$ is needed on the third line because, sometimes, the
smaller of $p$ and $q$ may divide all of $a, b, c, d$. Summing over $p, q$, and applying Lemma
1, one discovers that it would suffice if

^'^^ C^(loglogx + E)^*^-^'-^^-^(loglogP, + B)'-^ _ / (loglogx + P)^^-^^-^ 
^ (i - l)\{k -i-W - l)!2Pi logPi ~ V 

and 

C\2k -2i-2W- 2) ! (log log + B)'-^ _ f {2k-2W -2)\ 
^ (i - 1)!(A; -i-W - iy?Pi\ogPi ~ V {k-W -ly? 

uniformly for $1 \le W \le k - 1$. Here the constants implicit in the "big Oh" notation will
depend on the value of $C$. These results are straightforward to establish, in the manner
of §4.4.

The extension of Theorem 2 is less involved. For each prime p, 

1 (E(e2l,2>i/4))2 9 

\ P - I ) 2 ^ ^v>y^> - 2Ee^ - 32C 

It follows as in §6.2 that, with probability at least $(9/32C)^{\pi(q)}$, $\mathbb{E}(M^{(k)}(x)^2 \mid \epsilon_2, \epsilon_3, \ldots, \epsilon_q)$
is greater than

^ p) ^ N 2'^W ^ 2'^W ^ 2^W^+^(/ 

ui(M)=uj(N),M<N 

for all large x (depending on 5, which itself must be larger than an absolute constant). 
The reader may check that this is still enough to establish the result, with much to 
spare. 



Appendix A. Sketch proof of Technical Lemma 1 

We sketch the proofs of the four estimates making up Technical Lemma 1. To simplify 
the exposition, we will write 

$$U := [\log\sqrt{M}/\log 2], \quad \text{and} \quad V := [\log(M/m)/\log 2].$$

To deal with the double logarithms in a non-trivial way, we require the following
bounds: if $n \in \mathbb{N} \cup \{0\}$, $C > 0$, and $x \ge 1$, then
and if $x \ge e$ then



tlog^t 

These are easily established by making appropriate changes of variables, and induction
on n. 

For the first estimate, by Number Theory Result 1 the left hand side is 
/ (loglogm + ^)°-^ (loglog(2-^V^ + 2) + Cy 



(a-l)!logm ^ log^(2-^-iv^+ 1) 



^ I (log log m + A (loglog(2^+i-^ + 2) + Cy 



(a-l)!logm log2(2^-^-i + 1) 

Counting the terms of the sum backwards, we find it is 



whence the result. 

By partial summation and Number Theory Result 1, the left hand side in the second 
estimate is 

^ / (loglog(m + 1) + C)"(loglogm + B^-^ 

\ {a — 1)! log^ m 

p (loglog(t+l) + C)" t(loglog(t + l) + g)"-i ^^ 
^ U tnogt ■ (a-l)!logt 

Unless m < 3, when the estimate is trivial anyway, the first term may be omitted (at 
the cost of increasing the implicit constant, and values of B and C under the integral, 
by some fixed amounts). The result then follows as before. 
The sum in the third estimate is at most 

^ / M(loglogM + ^)°-^ y+^m 

\^ (a - 1)! ^ Mlog(2^+im)log^(l + 2^) 

Removing a factor 1/ logMlog^(l + M/m) from the sum, we are left to bound 

,._v. log(2^+V01og^(l + 2^) ^^(^ f V + l - 
^ log(2^+im) V(l + 20 + 

It is easy to see that this quantity can be bounded independently of m and M, e.g. by 
considering the ratio of consecutive summands. 

For the fourth estimate, it will suffice to show that 

[logm/log2] , /7\r/on , ^ ' 

y-v logm log log(A'/2*) + C 



2[\ogm/\og2]-i(^l + 1) \\og\og{N/m) + C 
has a bound depending only on $C$ and $D$. However, the ratio of the $i$ and $i - 1$ summands
is at least



2i ^ log(l-log2/log(iV/y-i)) Y J^L^^f 1 



i + loglog(iV/2*-i) + C J - ^ + lV Vlog(^/2'"^)loglog(iV/m) 

2i ( ^( 1 
> -. — 7 1 + 



i + 1 y ylog(A^/m) loglog(A^/m 

This is more than 5/4, say, provided that i > 2 and N/m is larger than a constant 
depending on D only. If N/m is not so large, we have 

loglog(A^/2^) <^D loglog(m/2^ + 3) and t <c 1- 

The reader can check that the sum over i is dominated by terms with i largest, 

and the result still follows. 

Acknowledgements. The author would like to thank his PhD supervisor, Ben Green, for 